Reward Design for an Online Reinforcement Learning Algorithm Supporting Oral Self-Care

Authors

Abstract

While dental disease is largely preventable, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore, patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of current actions on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been designed to run stably and autonomously in a constrained, real-world setting characterized by highly noisy, sparse data. We address this challenge by designing a quality reward that maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics. To the best of our knowledge, Oralytics is the first mobile health study utilizing an RL algorithm designed to prevent dental disease by optimizing the delivery of motivational messages supporting oral self-care behaviors.


Similar resources

An Average-Reward Reinforcement Learning

Recently, there has been growing interest in average-reward reinforcement learning (ARL), an undiscounted optimality framework that is applicable to many different control tasks. ARL seeks to compute gain-optimal control policies that maximize the expected payoff per step. However, gain-optimality has some intrinsic limitations as an optimality criterion, since for example, it cannot distinguish ...


Reinforcement Learning for Automatic Online Algorithm Selection - an Empirical Study

In this paper a reinforcement learning methodology for automatic online algorithm selection is introduced and empirically tested. It is applicable to automatic algorithm selection methods that predict the performance of each available algorithm and then pick the best one. The experiments confirm the usefulness of the methodology: using online data results in better performance. As in many onlin...


Loss is its own Reward: Self-Supervision for Reinforcement Learning

Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for end-to-end optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquito...


An Average-Reward Reinforcement Learning Algorithm for Computing Bias-Optimal Policies

Sridhar Mahadevan, Department of Computer Science and Engineering, University of South Florida, Tampa, Florida 33620. Abstract: Average-reward reinforcement learning (ARL) is an undiscounted optimality framework that is generally applicable to a broad range of control tasks. ARL computes gain-optimal control policies that maximize the expected pa...


Evolved Intrinsic Reward Functions for Reinforcement Learning

The reinforcement learning (RL) paradigm typically assumes a given reward function that is part of the problem being solved by the agent. However, in animals, all reward signals are generated internally, rather than being received directly from the environment. Furthermore, animals have evolved motivational systems that facilitate learning by rewarding activities that often bear a distal relati...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i13.26866